We're entering an era where AI-focused hardware and software advances make the AI PC a reality. Intel provides highly optimized developer support for AI workloads by including the OpenVINO™ toolkit on your PC.
Seamlessly transition projects from early AI development on the PC to cloud-based training to edge deployment. More easily move AI workloads across CPU, GPU, and NPU to optimize models for efficient deployment. With OpenVINO, you can accelerate AI inference, achieve lower latency, and increase throughput while maintaining accuracy.
Unlock AI features such as real-time language translation, automated inferencing, and enriched gaming experiences.
The Intel® Core™ Ultra processor accelerates AI on the PC by combining a CPU, GPU, and NPU through a 3D performance hybrid architecture, together with high-bandwidth memory and cache.
Intel® Core™ desktop processors optimize your gaming, content creation, and productivity.
Explore additional configurations for GPUs and NPUs to get the most out of the OpenVINO toolkit.
Download this comprehensive whitepaper on LLM optimization using compression techniques. Learn to use the OpenVINO toolkit to compress LLMs, integrate them into AI applications, and deploy them on your PC with maximum performance.
Notebooks and Demos
Learn and experiment with the OpenVINO toolkit using these preconfigured Jupyter* Notebooks.
Craft chatbots powered by an LLM using the OpenVINO toolkit.
Run an instruction-following text-generation pipeline.
Learn about image generation using the LCM and the OpenVINO toolkit.
Experience automatic speech recognition with this model and the OpenVINO toolkit.
Venture into text-to-image generation and infinite zoom capabilities with Stable Diffusion* v2 and the OpenVINO toolkit.
Use BLIP for visual language processing tasks like visual question answering and image captioning.
Discover a single-stage, autoregressive transformer model that produces high-quality music samples based on text descriptions or audio prompts.
Learn how to convert and optimize YOLOv8* models; a conversion sketch follows this list.
Predict the 2D position and orientation of each person in an image or a video.
Explore all available OpenVINO notebooks to unlock even more possibilities for optimized deep learning inference.
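As a taste of the YOLOv8 notebook's conversion step, here is a minimal sketch assuming the ultralytics package is installed; the model name and test image are illustrative:

```python
# Minimal sketch: export a pretrained YOLOv8 checkpoint to OpenVINO IR.
# Assumes: pip install ultralytics (model name and image URL are illustrative).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")           # downloads a small pretrained checkpoint
model.export(format="openvino")      # writes an IR folder, e.g. yolov8n_openvino_model/

# The exported model can be loaded back through the same API for inference
ov_model = YOLO("yolov8n_openvino_model/")
results = ov_model("https://ultralytics.com/images/bus.jpg")
```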
Embark on your AI development journey with beginner-friendly video tutorials. Gain valuable insights from experts and prepare to advance your skills.
Load optimized models from the Hugging Face Hub and create pipelines to run inference with OpenVINO Runtime without rewriting your APIs.
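For example, here is a minimal sketch with the optimum-intel package; the model ID is illustrative, and any Hugging Face causal-LM checkpoint follows the same pattern:

```python
# Minimal sketch: run a Hugging Face model on OpenVINO Runtime via optimum-intel.
# Assumes: pip install optimum[openvino] transformers (model ID is illustrative).
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

model_id = "gpt2"

# export=True converts the PyTorch checkpoint to OpenVINO IR on the fly
model = OVModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The familiar transformers pipeline API works unchanged
generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(generator("OpenVINO accelerates", max_new_tokens=30)[0]["generated_text"])
```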
Discover how Intel® Core™ Ultra processors enable you to use the power of CPU, GPU, and NPU to accelerate AI development on the PC.
The OpenVINO™ toolkit enables you to optimize a deep learning model from almost any framework and deploy it with best-in-class performance on a range of Intel® processors and other hardware platforms.
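In practice, the convert-and-deploy flow can be as short as the sketch below; the file name and device choice are illustrative:

```python
# Minimal sketch: convert a trained model and compile it for an Intel device.
# Assumes: pip install openvino (the ONNX file path is illustrative).
import openvino as ov

core = ov.Core()

# Convert a model from another framework (an ONNX file, TensorFlow SavedModel,
# an in-memory PyTorch module, etc.) to OpenVINO's intermediate representation
model = ov.convert_model("model.onnx")
ov.save_model(model, "model.xml")    # optional: persist the IR to disk

# "AUTO" lets OpenVINO choose among the available CPU, GPU, and NPU devices
compiled = core.compile_model(model, "AUTO")
# result = compiled(input_tensor)    # run inference once an input is prepared
```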
Improve your proficiency in deep learning by building your own project from start to finish.
What's New in 2024.3
The OpenVINO™ toolkit 2024.3 release enhances generative AI (GenAI) accessibility with improved large language model (LLM) performance and expanded model coverage. It also boosts portability and performance for deployment anywhere: at the edge, in the cloud, or locally. The top features of this release are:
Easier Model Access and Conversion
| Product | Details |
|---|---|
| New Model Support | Support for Phi-3-mini, a family of AI models that takes advantage of the power of small language models for faster, more accurate, and cost-effective text processing. Llama 3 optimizations for CPUs, built-in GPUs, and discrete GPUs deliver improved performance and efficient memory usage. |
| Python* | Custom Python operations are now enabled in the OpenVINO toolkit, making it easier for Python developers to code their custom operations instead of writing them in C++ (also supported). Custom operations let you implement your own specialized operations in any model. |
Generative AI and LLM Enhancements
Expanded model support and accelerated inference.
| Product | Details |
|---|---|
| New Jupyter Notebooks | An expansion of the Jupyter Notebook collection ensures better coverage for new models, with several noteworthy notebooks added. |
| Performance Improvements for LLMs | A GPTQ method for 4-bit weight compression was added to the Neural Network Compression Framework (NNCF) for more efficient inference and improved performance of compressed LLMs; a compression sketch follows this table. LLM performance is significantly improved, with reduced latency, on both built-in and discrete GPUs. |
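As a rough illustration of that GPTQ flow, the sketch below uses NNCF's compress_weights entry point; the model path is illustrative, calibration_samples is a hypothetical iterable of model inputs, and the gptq flag reflects the NNCF API in recent 2024 releases:

```python
# Minimal sketch: 4-bit GPTQ weight compression of an LLM with NNCF.
# Assumes: pip install nncf openvino (paths and calibration data are placeholders).
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("llm.xml")                # an LLM already in OpenVINO IR

# GPTQ is data-aware, so it needs a small calibration set of example inputs
calibration_samples = [...]                       # hypothetical: dicts of model inputs
calibration_dataset = nncf.Dataset(calibration_samples)

compressed = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,       # 4-bit symmetric weights
    group_size=128,
    dataset=calibration_dataset,
    gptq=True,                                    # enable the GPTQ algorithm
)
ov.save_model(compressed, "llm_int4.xml")
```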
More Portability and Performance
Develop once, deploy anywhere. The OpenVINO toolkit enables developers to run AI at the edge, in the cloud, or locally.
| Product | Details |
|---|---|
| Model Serving Enhancements | Preview: The OpenVINO model server now supports an OpenAI*-compatible API, continuous batching, and PagedAttention, which enable significantly higher throughput for parallel inferencing, especially on Intel® Xeon® processors serving LLMs to many concurrent users; a request sketch follows this table. The OpenVINO toolkit back end for the NVIDIA Triton* Inference Server now supports dynamic input shapes. TorchServe is integrated through torch.compile with the OpenVINO back end for easier model deployment, provisioning to multiple instances, model versioning, and maintenance. |
| Intel Hardware Support | Second-token latency and memory footprint are significantly improved for FP16-weight LLMs on CPU platforms based on Intel® Advanced Vector Extensions 2 (13th gen Intel® Core™ processors) and Intel® Advanced Vector Extensions 512 (3rd gen Intel® Xeon® Scalable processors), particularly for small batch sizes. Preview: Support for the Intel® Xeon® 6 processor. |
| Generate API | Preview: The new Generate API simplifies text generation with LLMs to only a few lines of code; a usage sketch follows this table. The API is available through the newly launched OpenVINO Toolkit GenAI Package. |
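The request sketch referenced above: a minimal example of calling the model server's OpenAI-compatible endpoint with the standard openai Python client, assuming a server already running on localhost port 8000 with a deployed model named "llama" (host, port, and model name are illustrative):

```python
# Minimal sketch: query an OpenVINO model server via its OpenAI-compatible API.
# Assumes: pip install openai, and a server running at localhost:8000 with a
# model deployed under the name "llama" (all names here are illustrative).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v3", api_key="unused")

response = client.chat.completions.create(
    model="llama",
    messages=[{"role": "user", "content": "What is OpenVINO?"}],
)
print(response.choices[0].message.content)
```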
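And the usage sketch for the Generate API, assuming the openvino-genai package is installed and an LLM has already been exported to OpenVINO format in a local directory (the path and device are illustrative):

```python
# Minimal sketch: text generation with the new Generate API.
# Assumes: pip install openvino-genai, and a model directory containing an
# LLM already converted to OpenVINO format (the path is illustrative).
import openvino_genai

pipe = openvino_genai.LLMPipeline("./llama-3-8b-int4", "CPU")
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
```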
The OpenVINO™ toolkit is an open source toolkit that accelerates AI inference with lower latency and higher throughput while maintaining accuracy, reducing model footprint, and optimizing hardware use. It streamlines AI development and integration of deep learning in domains like computer vision, large language models (LLMs), and generative AI.
Learn with like-minded AI developers by joining live and on-demand webinars focused on GenAI, LLMs, AI PC, and more, including code-based workshops using Jupyter* Notebook.
Convert and optimize models trained using popular frameworks like TensorFlow* and PyTorch*. Deploy across a mix of Intel® hardware and environments, on-premise and on-device, in the browser, or in the cloud.
Get started with OpenVINO and all the resources you need to learn, try samples, see performance, and more.
Review optimization and deployment strategies using the OpenVINO toolkit. Plus, use compression techniques with LLMs on your PC.
Intel® Geti™ is a commercial software platform that enables enterprise teams to develop vision AI models faster. With the platform, companies can build models with minimal data, and OpenVINO integration facilitates deploying solutions at scale.
Explore the Capabilities of the Intel® Geti™ Platform
When you are ready to go to market with your solution, explore ISV solutions built on OpenVINO. This ebook is designed to help you find a solution that best addresses your use-case needs; it is organized into sections, such as banking or healthcare, to help you navigate the solutions table more easily.
Explore the AI Inference Catalog
Take advantage of add-ons that extend the possibilities of the toolkit, and implement existing and new functionality now available in the core toolkit.
Estimate deep learning inference performance on supported devices.
Use this add-on to build, transform, and analyze datasets.
This cross-platform, command-line tool facilitates the transition between training and deployment environments, performs static model analysis, and adjusts deep learning models for optimal performance on end-point target devices.
Use this framework based on PyTorch for quantization-aware training.
Hugging Face* has a repository for the OpenVINO toolkit that provides resources and models aimed at optimizing deep learning models for inference on Intel hardware.
This scalable inference server is for serving models optimized with the Intel® Distribution of OpenVINO™ toolkit.
Explore ways to get involved and stay up to date with the latest announcements.
Optimize, fine-tune, and run comprehensive AI inference using the included model optimizer, runtime, and development tools.